NSF PAR Search | NSF Public Access Repository

Cross-silo Federated Learning with Record-level Personalized Differential Privacy

https://doi.org/10.1145/3658644.3670351

Liu, Junxu; Lou, Jian; Xiong, Li; Liu, Jinfei; Meng, Xiaofeng (December 2024, ACM)

Full Text Available
Efficient Sampling Approaches to Shapley Value Approximation

https://doi.org/10.1145/3588728

Zhang, Jiayao; Sun, Qiheng; Liu, Jinfei; Xiong, Li; Pei, Jian; Ren, Kui (May 2023, Proceedings of the ACM on Management of Data)

Shapley value provides a unique way to fairly assess each player's contribution in a coalition and has enjoyed many applications. However, the exact computation of Shapley value is #P-hard due to its combinatorial nature. Many existing applications of Shapley value are based on Monte Carlo approximation, which requires a large number of samples and the assessment of utility on many coalitions to reach a high-quality approximation, and thus is still far from efficient. Can we achieve an efficient approximation of Shapley value by smartly obtaining samples? In this paper, we treat the sampling approach to Shapley value approximation as a stratified sampling problem. Our main technical contributions are a novel stratification design and two sample allocation methods based on Neyman allocation and the empirical Bernstein bound, respectively. Experimental results on several real and synthetic data sets demonstrate the effectiveness and efficiency of our novel stratification design and sampling approaches.
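To make the stratified-sampling idea in this abstract concrete, the following is a minimal sketch, not the paper's method. It assumes the simplest possible design: coalitions are stratified by size, and every stratum gets the same number of samples. The paper's own stratification design and its Neyman and empirical Bernstein allocations are not reproduced. The names `stratified_shapley`, `toy_utility`, and the player weights are hypothetical.

```python
import random

def stratified_shapley(players, utility, samples_per_stratum=200, seed=0):
    """Estimate Shapley values by stratifying coalitions on their size."""
    rng = random.Random(seed)
    n = len(players)
    estimates = {}
    for i in players:
        others = [p for p in players if p != i]
        stratum_means = []
        for size in range(n):  # coalition sizes 0 .. n-1
            total = 0.0
            for _ in range(samples_per_stratum):
                coalition = frozenset(rng.sample(others, size))
                # Marginal contribution of player i to this coalition.
                total += utility(coalition | {i}) - utility(coalition)
            stratum_means.append(total / samples_per_stratum)
        # Each size stratum carries weight 1/n in the Shapley value.
        estimates[i] = sum(stratum_means) / n
    return estimates

def toy_utility(coalition):
    # Hypothetical utility: squared sum of the members' weights.
    weights = {"a": 1.0, "b": 2.0, "c": 3.0}
    return sum(weights[p] for p in coalition) ** 2

values = stratified_shapley(["a", "b", "c"], toy_utility)
```

For this toy utility the exact Shapley value of a player is its weight times the total weight (6, 12, and 18 here), so the estimates can be checked against a known answer. Equal allocation is used only for simplicity; an allocation that gives more samples to high-variance strata would be the natural next step.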
Full Text Available
Equitable Data Valuation Meets the Right to Be Forgotten in Model Markets

https://doi.org/10.14778/3611479.3611531

Xia, Haocheng; Liu, Jinfei; Lou, Jian; Qin, Zhan; Ren, Kui; Cao, Yang; Xiong, Li (July 2023, Proceedings of the VLDB Endowment)

The increasing demand for data-driven machine learning (ML) models has led to the emergence of model markets, where a broker collects personal data from data owners to produce high-usability ML models. To incentivize data owners to share their data, the broker needs to price data appropriately while protecting their privacy. For equitable data valuation, which is crucial in data pricing, Shapley value has become the most prevalent technique because it satisfies all four desirable properties in fairness: balance, symmetry, zero element, and additivity. For the right to be forgotten, which is stipulated by many data privacy protection laws to allow data owners to unlearn their data from trained models, the sharded structure in ML model training has become a de facto standard to reduce the cost of future unlearning by avoiding retraining the entire model from scratch. In this paper, we explore how the sharded structure for the right to be forgotten affects Shapley value for equitable data valuation in model markets. To adapt Shapley value for the sharded structure, we propose S-Shapley value, a sharded structure-based Shapley value, which satisfies four desirable properties for data valuation. Since we prove that computing S-Shapley value is #P-complete, two sampling-based methods are developed to approximate S-Shapley value. Furthermore, to efficiently update valuation results after data owners unlearn their data, we present two delta-based algorithms that estimate the change of data value instead of the data value itself. Experimental results demonstrate the efficiency and effectiveness of the proposed algorithms.
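The sharded structure mentioned in this abstract can be illustrated with a small sketch. It is an assumption-laden toy, not the paper's system: records are assigned to shards round-robin, each "sub-model" is simply the mean of its shard, and the aggregate is the average of the sub-models. The class `ShardedModel` and the function `train` are hypothetical names. S-Shapley value and the delta-based update algorithms are not reproduced.

```python
def train(shard):
    # Stand-in "model": the mean of the shard's values (0.0 if empty).
    return sum(shard) / len(shard) if shard else 0.0

class ShardedModel:
    def __init__(self, records, num_shards):
        # Assign records to shards round-robin.
        self.shards = [records[i::num_shards] for i in range(num_shards)]
        self.models = [train(s) for s in self.shards]
        self.retrain_count = 0

    def predict(self):
        # Aggregate by averaging the sub-models' outputs.
        return sum(self.models) / len(self.models)

    def unlearn(self, record):
        # Retrain only the shard that held the record.
        for idx, shard in enumerate(self.shards):
            if record in shard:
                shard.remove(record)
                self.models[idx] = train(shard)
                self.retrain_count += 1
                return idx
        raise ValueError("record not found in any shard")

model = ShardedModel([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], num_shards=3)
before = list(model.models)
touched = model.unlearn(4.0)
```

After the call to `unlearn`, only the sub-model for the touched shard differs from `before`; the other two are reused unchanged, which is the cost saving the abstract refers to.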
Full Text Available
ShapleyFL: Robust Federated Learning Based on Shapley Value

https://doi.org/10.1145/3580305.3599500

Sun, Qiheng; Li, Xiang; Zhang, Jiayao; Xiong, Li; Liu, Weiran; Liu, Jinfei; Qin, Zhan; Ren, Kui (August 2023, KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Full Text Available
Dynamic Shapley Value Computation

https://doi.org/10.1109/ICDE55515.2023.00055

Zhang, Jiayao; Xia, Haocheng; Sun, Qiheng; Liu, Jinfei; Xiong, Li; Pei, Jian; Ren, Kui (April 2023, 2023 IEEE 39th International Conference on Data Engineering (ICDE))

Full Text Available
Projected federated averaging with heterogeneous differential privacy

https://doi.org/10.14778/3503585.3503592

Liu, Junxu; Lou, Jian; Xiong, Li; Liu, Jinfei; Meng, Xiaofeng (December 2021, Proceedings of the VLDB Endowment)

Federated Learning (FL) is a promising framework for multiple clients to learn a joint model without directly sharing the data. In addition to high utility of the joint model, rigorous privacy protection of the data and communication efficiency are important design goals. Many existing efforts achieve rigorous privacy by ensuring differential privacy for intermediate model parameters; however, they assume a uniform privacy parameter for all the clients. In practice, different clients may have different privacy requirements due to varying policies or preferences. In this paper, we focus on explicitly modeling and leveraging the heterogeneous privacy requirements of different clients and study how to optimize utility for the joint model while minimizing communication cost. As differentially private perturbations affect the model utility, a natural idea is to make better use of information submitted by the clients with higher privacy budgets (referred to as "public" clients, and the opposite as "private" clients). The challenge is how to use such information without biasing the joint model. We propose Projected Federated Averaging (PFA), which extracts the top singular subspace of the model updates submitted by "public" clients and utilizes them to project the model updates of "private" clients before aggregating them. We then propose communication-efficient PFA+, which allows "private" clients to upload projected model updates instead of original ones. Our experiments verify the utility boost of both algorithms compared to the baseline methods, whereby PFA+ achieves over 99% uplink communication reduction for "private" clients.
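The projection step described in this abstract can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the authors' implementation: the subspace is reduced to a single direction (rank 1), it is found by plain power iteration, the aggregate is an unweighted average, and no differential privacy noise is added. The helper names and the example updates are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_direction(updates, iterations=100):
    """Top right singular vector of the update matrix via power iteration."""
    dim = len(updates[0])
    # Deterministic start; assumed not orthogonal to the top direction.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # Apply (U^T U) to v without forming the matrix explicitly.
        coeffs = [dot(row, v) for row in updates]
        w = [sum(c * row[j] for c, row in zip(coeffs, updates))
             for j in range(dim)]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v

def project(update, direction):
    # Orthogonal projection of an update onto a unit-length direction.
    scale = dot(update, direction)
    return [scale * x for x in direction]

public_updates = [[2.0, 0.1], [4.0, -0.1]]
private_updates = [[1.0, 5.0]]  # hypothetical noisy "private" update

v = top_direction(public_updates)
projected = [project(u, v) for u in private_updates]
all_updates = public_updates + projected
aggregate = [sum(col) / len(all_updates) for col in zip(*all_updates)]
```

In this toy example the "public" updates lie almost entirely along the first coordinate, so the projection removes most of the second coordinate of the "private" update before averaging. A projected update is also fully described by one coefficient per retained direction, which hints at why uploading projected updates can reduce uplink traffic.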
Full Text Available
Dealer: an end-to-end model marketplace with differential privacy

https://doi.org/10.14778/3447689.3447700

Liu, Jinfei; Lou, Jian; Liu, Junxu; Xiong, Li; Pei, Jian; Sun, Jimeng (February 2021, Proceedings of the VLDB Endowment)
Data-driven machine learning has become ubiquitous. A marketplace for machine learning models connects data owners and model buyers, and can dramatically facilitate data-driven machine learning applications. In this paper, we take a formal data marketplace perspective and propose the first enD-to-end model marketplace with differential privacy (Dealer) towards answering the following questions: How to formulate data owners' compensation functions and model buyers' price functions? How can the broker determine prices for a set of models to maximize the revenue with arbitrage-free guarantee, and train a set of models with maximum Shapley coverage given a manufacturing budget to remain competitive? For the former, we propose a compensation function for each data owner based on Shapley value and privacy sensitivity, and a price function for each model buyer based on Shapley coverage sensitivity and noise sensitivity. Both privacy sensitivity and noise sensitivity are measured by the level of differential privacy. For the latter, we formulate two optimization problems for model pricing and model training, and propose efficient dynamic programming algorithms. Experimental results on the real chess dataset and synthetic datasets justify the design of Dealer and verify the efficiency and effectiveness of the proposed algorithms.
Full Text Available
PGLP: Customizable and Rigorous Location Privacy Through Policy Graph

https://doi.org/10.1007/978-3-030-58951-6_32

Cao, Yang; Xiao, Yonghui; Takagi, Shun; Xiong, Li; Yoshikawa, Masatoshi; Shen, Yilin; Liu, Jinfei; Jin, Hongxia; Xu, Xiaofeng (September 2020, Computer Security – ESORICS 2020)
Full Text Available